Model Based Steganography with Precover

Author

  • Tomáš Denemark
Abstract

It is widely recognized that steganography with side-information in the form of a precover at the sender enjoys significantly higher empirical security than other embedding schemes. Despite the success of side-informed steganography, current designs are purely heuristic and little has been done to develop the embedding rule from first principles. Building upon the recently proposed MiPOD steganography, in this paper we impose a multivariate Gaussian model on the acquisition noise and estimate its parameters from the available precover. The embedding is then designed to minimize the KL divergence between cover and stego distributions. In contrast to existing heuristic algorithms that modulate the embedding costs by $1 - 2|e|$, where $e$ is the rounding error, in our model-based approach the sender should modulate the steganographic Fisher information, which is a loose equivalent of embedding costs, by $(1 - 2|e|)^2$. Experiments with uncompressed and JPEG images show the promise of this theoretically well-founded approach.

Introduction

Steganography is a privacy tool in which messages are embedded in inconspicuous cover objects to hide the very presence of the communicated secret. Digital media, such as images, video, and audio, are particularly suitable cover sources because of their ubiquity and the fact that they contain random components, the acquisition noise. On the other hand, digital media files are extremely complex objects that are notoriously hard to describe with sufficiently accurate and estimable statistical models. This is the main reason why current steganography in such empirical sources [3] lacks perfect security and relies heavily on heuristics, such as embedding “costs” and intuitive modulation factors. Similarly, practical steganalysis resorts to increasingly complex high-dimensional descriptors (rich models) and advanced machine learning paradigms, including ensemble classifiers and deep learning.
Often, a digital media object is subjected to processing and/or format conversion prior to embedding the secret. The last step in the processing pipeline is typically quantization. In side-informed steganography with precover [21], the sender makes use of the unquantized cover values during embedding to hide data in a more secure manner. The first embedding scheme of this type described in the literature is embedding-while-dithering [14], in which the secret message was embedded by perturbing the process of color quantization and dithering when converting a true-color image to a palette format. Perturbed quantization [15] started another direction in which rounding errors of DCT coefficients during JPEG compression were used to modify the embedding algorithm. This method has been advanced through a series of papers [23, 24, 29, 20], culminating with approaches based on advanced coding techniques with a high level of empirical security [19, 18, 6].

Side-information can have many other forms. Instead of one precover, the sender may have access to the acquisition oracle (a camera) and take multiple images of the same scene. These multiple exposures can be used to estimate the acquisition noise and can also be incorporated during embedding. This research direction has been developed to a lesser degree than steganography with precover, most likely due to the difficulty of acquiring the required imagery and modeling the differences between acquisitions. In a series of papers [10, 12, 11], Franz et al. proposed a method in which multiple scans of the same printed image on a flat-bed scanner were used to estimate the model of the acquisition noise at every pixel. This requires acquiring a potentially large number of scans, which makes this approach rather labor intensive. Moreover, differences in the movement of the scanner head between individual scans lead to slight spatial misalignment that complicates using this type of side-information properly.
Recently, the authors of [7] showed how multiple JPEG images of the same scene can be used to infer the preferred direction of embedding changes. By working with quantized DCT coefficients instead of pixels, the embedding is less sensitive to small differences between multiple acquisitions.

Despite the success of side-informed schemes, there appears to be an alarming lack of theoretical analysis that would either justify the heuristics or suggest a well-founded (and hopefully more powerful) approach. In [13], the author has shown that the precover compensates for the lack of the cover model. In particular, for a Gaussian model of acquisition noise, precover-informed rounding is more secure than embedding designed to preserve the cover model estimated from the precover image, assuming the cover is “sufficiently non-stationary.” Another direction worth mentioning in this context is the bottom-up model-based approach recently proposed by Bas [2]. The author showed that a high-capacity steganographic scheme with a rather low empirical detectability can be constructed when the process of digitally developing a RAW sensor capture is sufficiently simplified. The impact of embedding is masked as an increased level of photonic noise, e.g., due to a higher ISO setting. It will likely be rather difficult, however, to extend this approach to realistic processing pipelines.

Inspired by the success of the multivariate Gaussian model in steganography for digital images [25, 17, 26], in this paper we adopt the same model for the precover and then derive the embedding rule to minimize the KL divergence between cover and stego distributions. The side-information is used to estimate the parameters of the acquisition noise and the noise-free scene.

In the next section, we review the current state of the art in heuristic side-informed steganography with precover. In the following section, we introduce a formal model of image acquisition.
In Section “Side-informed steganography with MVG acquisition noise”, we describe the proposed model-based embedding method, which is related to heuristic approaches in Section “Connection to heuristic schemes.” The bulk of the results from experiments on images represented in the spatial and JPEG domains appears in Section “Experiments.” In the subsequent section, we investigate whether the public part of the selection channel, the content adaptivity, can be incorporated in selection-channel-aware variants of steganalysis features to improve detection of side-informed schemes. The paper is then closed with Conclusions.

The following notation is adopted for technical arguments. Matrices and vectors will be typeset in boldface, while capital letters are reserved for random variables with the corresponding lower case symbols used for their realizations. In this paper, we only work with grayscale cover images. Precover values will be denoted with $x_{ij} \in \mathbb{R}$, while cover and stego values will be integer arrays $c_{ij}$ and $s_{ij}$, $1 \le i \le n_1$, $1 \le j \le n_2$, respectively. The symbols $[x]$, $\lceil x \rceil$, and $\lfloor x \rfloor$ are used for rounding and rounding up and down the value of $x$. By $\mathcal{N}(\mu, \sigma^2)$, we understand the Gaussian distribution with mean $\mu$ and variance $\sigma^2$. The complementary cumulative distribution function of a standard normal variable (the tail probability) will be denoted

$$Q(x) = \int_x^{\infty} (2\pi)^{-1/2} \exp\left(-z^2/2\right) \, dz.$$

Finally, we say that $f(x) \approx g(x)$ when $\lim_{x \to \infty} f(x)/g(x) = 1$.

Prior art in side-informed steganography with precover

All modern steganographic schemes, including those that use side-information, are implemented within the paradigm of distortion minimization. First, each cover element $c_{ij}$ is assigned a “cost” $\rho_{ij}$ that measures the impact on detectability should that element be modified during embedding. The payload is then embedded while minimizing the sum of costs of all changed cover elements, $\sum_{c_{ij} \neq s_{ij}} \rho_{ij}$.
A steganographic scheme that embeds with the minimal expected cost changes each cover element with probability

$$\beta_{ij} = \frac{\exp(-\lambda\rho_{ij})}{1 + \exp(-\lambda\rho_{ij})}, \tag{1}$$

if the embedding operation is constrained to be binary, and

$$\beta_{ij} = \frac{\exp(-\lambda\rho_{ij})}{1 + 2\exp(-\lambda\rho_{ij})}, \tag{2}$$

for a ternary scheme with equal costs of changing $c_{ij}$ to $c_{ij} \pm 1$. Syndrome-trellis codes [8] can be used to build practical embedding schemes that operate near the rate–distortion bound.

For steganography designed to minimize costs (embedding distortion), a popular heuristic to incorporate a precover value $x_{ij}$ during embedding is to modulate the costs based on the rounding error $e_{ij} = c_{ij} - x_{ij}$, $-1/2 \le e_{ij} \le 1/2$ [23, 29, 20, 18, 19, 6, 24]. A binary embedding scheme modulates the cost of changing $c_{ij} = [x_{ij}]$ to $[x_{ij}] + \operatorname{sign}(e_{ij})$ by $1 - 2|e_{ij}|$, while prohibiting the change to $[x_{ij}] - \operatorname{sign}(e_{ij})$:

$$\rho_{ij}(\operatorname{sign}(e_{ij})) = (1 - 2|e_{ij}|)\,\rho_{ij} \tag{3}$$

$$\rho_{ij}(-\operatorname{sign}(e_{ij})) = \Omega, \tag{4}$$

where $\rho_{ij}(u)$ is the cost of modifying the cover value by $u \in \{-1, 1\}$, $\rho_{ij}$ are the costs of some additive embedding scheme, and $\Omega$ is a large constant. This modulation can be justified heuristically because when $|e_{ij}| \approx 1/2$, a small perturbation of $x_{ij}$ could cause $c_{ij}$ to be rounded to the other side. Such coefficients are thus assigned a proportionally smaller cost because $1 - 2|e_{ij}| \approx 0$. On the other hand, the costs are unchanged when $e_{ij} \approx 0$, as it takes a larger perturbation of the precover to change the rounded value. A ternary version of this embedding strategy [6] allows modifications both ways with costs:

$$\rho_{ij}(\operatorname{sign}(e_{ij})) = (1 - 2|e_{ij}|)\,\rho_{ij} \tag{5}$$

$$\rho_{ij}(-\operatorname{sign}(e_{ij})) = \rho_{ij}. \tag{6}$$

Some embedding schemes do not use costs and, instead, minimize statistical detectability.
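The cost-based formulas above, equations (1) through (6), can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the value chosen for the large constant `OMEGA`, and the choice to pass `lam` in directly (rather than solve for it from a payload constraint) are all hypothetical. Equation (2) is read here as the probability of each of the two possible changes, so that the probability of no change is 1 − 2β. The helper `cost_from_beta_ternary` is simply the algebraic inverse of (2) with λ = 1.

```python
import math

# Stand-in for the "large constant" Omega in Eq. (4); the value is an assumption.
OMEGA = 1e10


def beta_binary(rho, lam=1.0):
    """Eq. (1): change probability for a binary embedding operation."""
    t = math.exp(-lam * rho)
    return t / (1.0 + t)


def beta_ternary(rho, lam=1.0):
    """Eq. (2): per-direction change probability for a ternary scheme
    with equal costs of changing by +1 and by -1."""
    t = math.exp(-lam * rho)
    return t / (1.0 + 2.0 * t)


def cost_from_beta_ternary(beta):
    """Inverse of Eq. (2) with lam = 1: rho = ln(1/beta - 2).
    Valid for 0 < beta < 1/3."""
    if not 0.0 < beta < 1.0 / 3.0:
        raise ValueError("beta must lie strictly between 0 and 1/3")
    return math.log(1.0 / beta - 2.0)


def modulate_costs(rho, e, ternary=True):
    """Eqs. (3)-(6): return the pair
    (cost of changing by sign(e), cost of changing by -sign(e)).
    The first cost is scaled by 1 - 2|e|; the second is left at rho for the
    ternary variant, or set to OMEGA (prohibited) for the binary variant."""
    if not -0.5 <= e <= 0.5:
        raise ValueError("rounding error must lie in [-1/2, 1/2]")
    modulated = (1.0 - 2.0 * abs(e)) * rho
    other = rho if ternary else OMEGA
    return modulated, other
```

For instance, `modulate_costs(4.0, 0.25)` halves the cost in the modulated direction and leaves the other direction at its original cost, while passing `ternary=False` replaces the second cost with `OMEGA`, which drives the corresponding change probability from `beta_binary` to zero.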
In MiPOD [25], the embedding probabilities $\beta_{ij}$ are derived from their impact on the cover multivariate Gaussian model by solving the following equation for each pixel $ij$:

$$\beta_{ij} I_{ij} = \lambda \ln \frac{1 - 2\beta_{ij}}{\beta_{ij}}, \tag{7}$$

where $I_{ij} = 2/\hat{\sigma}_{ij}^4$ is the Fisher information, with $\hat{\sigma}_{ij}^2$ an estimated variance of the acquisition noise at pixel $ij$, and $\lambda$ is a Lagrange multiplier determined by the payload size. To incorporate the side-information, the sender first converts the embedding probabilities into costs and then modulates them as in (3) or (5). This can be done by reversing the formula for optimal embedding probabilities for ternary cost-based schemes (2):

$$\rho_{ij} = \ln\left(1/\beta_{ij} - 2\right). \tag{8}$$

When reversing (2), $\lambda$ can be set to 1 because multiplying costs by a positive scalar does not change the embedding scheme.

Modeling acquisition

An image $\mathbf{x}$ acquired using an imaging sensor has two components, the true scene $\mathbf{t}$ and acquisition imperfections (noise) $\mathbf{n}$:

Publication date: 2017